
several improvements #80

Open · wants to merge 10 commits into base: master

Conversation

matthieu637

fix: rprop learning
new activation functions (LeCun, ReLU, leaky ReLU)
L2/L1 norm & regularization
possibility to define your own error function
possibility to weight each sample

@troiganto

Before merging this, I think the three new activation functions should be documented in fann_data.h. Also, corresponding variants should be added to the C++ enum FANN::activation_function_enum in fann_data_cpp.h.
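As a rough illustration (the variant names below are placeholders; they must mirror the order and values of whatever the new entries in the C enum fann_activationfunc_enum end up being called), the C++ side could follow the pattern of the existing entries:

/* fann_data_cpp.h, enum activation_function_enum -- hypothetical sketch */
enum activation_function_enum
{
    /* ... existing variants (LINEAR = FANN_LINEAR, SIGMOID, ...) ... */
    LECUN,       /* placeholder name */
    RELU,        /* placeholder name */
    LEAKY_RELU   /* placeholder name */
};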

@@ -1,4 +1,4 @@
########### install files ###############

-INSTALL_FILES( /include FILES fann.h doublefann.h fann_internal.h floatfann.h fann_data.h fixedfann.h compat_time.h fann_activation.h fann_cascade.h fann_error.h fann_train.h fann_io.h fann_cpp.h )
+INSTALL_FILES( /include/fann FILES fann.h doublefann.h fann_internal.h floatfann.h fann_data.h fixedfann.h fann_activation.h fann_cascade.h fann_error.h fann_train.h fann_io.h fann_cpp.h )

@bukka Removing compat_time.h seems fine, but do we really want headers to be installed in a "fann" subdirectory?

@troiganto left a comment

@bukka @andersfylling @steffennissen I've gone through the commits on this PR and added my comments where appropriate. Since I'm still new to this repository, I wouldn't mind another pair of eyes going over this. Feedback on this review would be great as well, if possible. (:

Overall, most of the changes seem alright to me, though they lack doc comments. I'm somewhat worried about commit a3863a70, which adds labelled weights to irpropm training, and commit cc684290, which adds irpropm training with a custom error function. They copy-paste a lot of code in order to make minuscule changes very deep in the call stack. That seems like a maintenance hazard to me. Is this fine to merge, or would we prefer a more general solution, maybe based on new options on the struct fann?
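For the record, the kind of "more general solution" I have in mind would be roughly the following (purely a sketch; none of these fields exist in this PR or in current FANN):

struct fann
{
    /* ... existing members ... */

    /* hypothetical: per-sample weights; NULL means all samples count equally */
    fann_type *sample_weights;

    /* hypothetical: user-supplied error function; NULL means built-in MSE */
    fann_type (*user_error)(fann_type *output, fann_type *desired,
                            int num_output, void *user_data);
    void *user_error_data;
};

The existing fann_train_epoch()/fann_compute_MSE() code paths could then consult these fields instead of being duplicated per variant.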

@@ -212,6 +212,7 @@ FANN_EXTERNAL void FANN_API fann_train_on_file(struct fann *ann, const char *fil
This function appears in FANN >= 1.2.0.
*/
FANN_EXTERNAL float FANN_API fann_train_epoch(struct fann *ann, struct fann_train_data *data);
+FANN_EXTERNAL float FANN_API fann_train_epoch_lw(struct fann *ann, struct fann_train_data *data, fann_type* label_weight);

This function and the other fann_*_lw() functions should get documentation in the same style as the rest.

We likely also want to add train_epoch_lw() in some way to the FANN::neural_net class in fann_cpp.h.
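Something along these lines, for instance (the method name is just a suggestion, and the member names ann and train_data are assumptions based on the existing wrapper code):

/* fann_cpp.h, class FANN::neural_net -- hypothetical sketch */
float train_epoch_lw(training_data &data, fann_type *label_weight)
{
    float mse = 0.0f;
    if ((ann != NULL) && (data.train_data != NULL))
        mse = fann_train_epoch_lw(ann, data.train_data, label_weight);
    return mse;
}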

case FANN_TRAIN_RPROP:
return fann_train_epoch_irpropm_lw(ann, data, label_weight);
default:
printf("FANN : fann_train_epoch_lw not implemented with others algo\n");

I think it makes more sense to call:

fann_error((struct fann_error *) ann, FANN_E_CANT_USE_TRAIN_ALG);

here and fall through to the return statement.
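Concretely, something like the following (assuming this function keeps the same return 0; fallback as fann_train_epoch()):

    case FANN_TRAIN_RPROP:
        return fann_train_epoch_irpropm_lw(ann, data, label_weight);
    default:
        /* report the unsupported algorithm instead of printf'ing to stdout */
        fann_error((struct fann_error *) ann, FANN_E_CANT_USE_TRAIN_ALG);
    }
    return 0;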

neuron_diff2 = (float) (neuron_diff * neuron_diff);
#endif

ann->MSE_value += neuron_diff2 * label_weight;

I've got no experience with the FIXEDFANN mode, do we have to divide label_weight by (float) ann->multiplier here?
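If label_weight is indeed passed as a fixed-point fann_type in that build, the adjustment would presumably look something like this (just a sketch of the open question above, not a verified fix):

#ifdef FIXEDFANN
    ann->MSE_value += neuron_diff2 * (label_weight / (float) ann->multiplier);
#else
    ann->MSE_value += neuron_diff2 * label_weight;
#endif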

@@ -213,6 +213,9 @@ FANN_EXTERNAL void FANN_API fann_train_on_file(struct fann *ann, const char *fil
*/
FANN_EXTERNAL float FANN_API fann_train_epoch(struct fann *ann, struct fann_train_data *data);
FANN_EXTERNAL float FANN_API fann_train_epoch_lw(struct fann *ann, struct fann_train_data *data, fann_type* label_weight);
+FANN_EXTERNAL float FANN_API fann_train_epoch_irpropm_gradient(struct fann *ann, struct fann_train_data *data, fann_type (*errorFunction)(fann_type*,fann_type*,int,void*),void*);
+FANN_EXTERNAL void FANN_API fann_compute_MSE_gradient(struct fann *, fann_type *, fann_type (*errorFunction)(fann_type*,fann_type*,int,void*), void*);
+FANN_EXTERNAL void FANN_API fann_backpropagate_MSE_firstlayer(struct fann *);

fann_train_epoch_irpropm_gradient() needs documentation.
fann_compute_MSE_gradient() is an internal function and should not be mentioned in this header.
fann_backpropagate_MSE_firstlayer() does not seem to serve any purpose at all. Do we really need it?
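On the first point: the doc comment needs to spell out the callback contract, because it is not obvious from the signature alone. My reading of the arguments is (network outputs, desired outputs, number of outputs, user data), but whether the return value is the total error for the sample or something per-output is exactly what the documentation has to pin down. A purely illustrative callback under that reading:

/* Hypothetical example only -- argument meaning inferred from the
   declared signature, not from documented behaviour. */
static fann_type squared_error(fann_type *output, fann_type *desired,
                               int num_output, void *user_data)
{
    fann_type err = 0;
    int i;
    (void) user_data;
    for(i = 0; i < num_output; i++)
    {
        fann_type diff = desired[i] - output[i];
        err += diff * diff;
    }
    return err;
}

/* ... */
fann_train_epoch_irpropm_gradient(ann, data, squared_error, NULL);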

last_layer_begin->activation_steepness, neuron_value,
last_layer_begin->sum) * neuron_diff;
//printf("DEBUG _mse_grad %lf %lf %lf %lf\n", *error_it, neuron_diff, last_layer_begin->activation_function, last_layer_begin->activation_steepness );
//desired_output++;

These commented-out sections should likely be removed before merging the code.

@@ -443,6 +444,7 @@ struct fann *fann_create_from_fd(FILE * conf, const char *configuration_file)
fann_scanf("%u", "network_type", &tmpVal);
ann->network_type = (enum fann_nettype_enum)tmpVal;
fann_scanf("%f", "learning_momentum", &ann->learning_momentum);
fann_scanf("%f", "learning_l2_norm", &ann->learning_l2_norm);

Same as for fann_save_internal_fd() further up.

@@ -769,7 +769,8 @@ FANN_EXTERNAL float FANN_API fann_get_learning_momentum(struct fann *ann);
This function appears in FANN >= 2.0.0.
*/
FANN_EXTERNAL void FANN_API fann_set_learning_momentum(struct fann *ann, float learning_momentum);

FANN_EXTERNAL float FANN_API fann_get_learning_l2_norm(struct fann *ann);
FANN_EXTERNAL void FANN_API fann_set_learning_l2_norm(struct fann *ann, float learning_l2_norm);

These functions are external and should be documented.
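For reference, the surrounding declarations use NaturalDocs-style comments, so something along these lines would fit (the wording, the default value, and the version number are for the PR author to confirm):

/* Function: fann_set_learning_l2_norm

   Set the strength of the L2 regularization term applied during training
   (0 disables it).

   See also:
       <fann_get_learning_l2_norm>

   This function appears in FANN >= ?.?.?
*/
FANN_EXTERNAL void FANN_API fann_set_learning_l2_norm(struct fann *ann, float learning_l2_norm);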

@@ -491,6 +491,7 @@ struct fann

/* The learning momentum used for backpropagation algorithm. */
float learning_momentum;
+float learning_l2_norm;

Is learning_l2_norm the best name for this parameter? It's actually not the L2 norm, but rather the weight of the L2 norm relative to the error function.
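Put differently, with this change the objective being minimized is effectively E_total = E_data + (lambda/2) * sum_i w_i^2, and learning_l2_norm is the coefficient lambda; its gradient lambda * w_i is exactly the term subtracted from the slope in the hunk below. A name like learning_l2_decay or l2_regularization might describe that role more accurately, though that's just a suggestion.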

@@ -978,7 +978,7 @@ void fann_update_weights_irpropm(struct fann *ann, unsigned int first_weight, un
for(; i != past_end; i++)
{
prev_step = fann_max(prev_steps[i], (fann_type) 0.0001); /* prev_step may not be zero because then the training will stop */
-slope = train_slopes[i];
+slope = train_slopes[i] - ann->learning_l2_norm * weights[i];

This only adds the L2 regularization to the irpropm algorithm. The same term should be added to the other fann_update_weights*() functions.
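For illustration, the adjustment in each of the other update loops would presumably be the same one-liner (a sketch only; I'm assuming the other loops also iterate over train_slopes[]/weights[] the way the irpropm one does):

/* hypothetical sketch for another fann_update_weights_*() loop */
for(; i != past_end; i++)
{
    slope = train_slopes[i] - ann->learning_l2_norm * weights[i];
    /* ... the algorithm-specific weight update then uses `slope` as before ... */
}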

SET(PARALLEL_INCLUDES parallel_fann.h parallel_fann.hpp)
ENDIF(NOT OPENMP_FOUND OR DISABLE_PARALLEL_FANN)

install (FILES fann.h doublefann.h fann_internal.h floatfann.h fann_data.h fixedfann.h fann_activation.h fann_cascade.h fann_error.h fann_train.h fann_io.h fann_cpp.h fann_data_cpp.h fann_training_data_cpp.h ${PARALLEL_INCLUDES} DESTINATION ${INCLUDE_INSTALL_DIR})

For consistency's sake, install should be INSTALL here.

This INSTALL command was added in CMake 2.4, which I think is old enough for us to assume that everyone has it. The CMakeLists.txt file itself does not seem to require any minimum CMake version.
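If we want to make that assumption explicit, a one-line CMAKE_MINIMUM_REQUIRED(VERSION 2.8) at the top of the top-level CMakeLists.txt would settle it (2.8 is only an example floor, not a vetted requirement).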

@bukka
Member

bukka commented May 23, 2018

@troiganto Nice review!

I agree that the docs should be updated and C++ headers added. I think it would also be good to have some tests (googletest) or at least some examples before it gets merged.

@andersfylling
Contributor

Affects issue #72

@mrrstrat

mrrstrat commented Feb 3, 2020

Is there anyone keeping this alive?

@troiganto

Hi @mrrstrat,

I'm still here! I've been writing my doctoral thesis this past year and have been so swamped with it that I didn't really have time for side projects like this one. I also haven't heard from the other maintainers in a while, unfortunately …

I'd really like to get back to FANN once this is over. (It should be soon. If you haven't heard from me in ~2 months, feel free to ping again.)

Regarding this particular PR, it seems the submitter abandoned it after I requested some changes. I'm not particularly sure what the best process is here. I guess someone else has to incorporate the changes and resubmit the diff as a new PR?

@mrrstrat

mrrstrat commented Feb 3, 2020

@troiganto,

I have worked with FANN since about 2004. A couple of years ago I manually put in some of the ReLU changes discussed here, but did not run many comparative regression tests against other transfer function types. It's probably wishful thinking to rely 100% on a rectified linear function to completely solve a dying-gradient problem. Handling this in the past involved changing the network architecture, network and system parameters, sampling, scope, and training, and restructuring the problem and the desired solution. Indeterminate slope derivation can be a funny animal to corral :-).

It looks like enough time has elapsed that another PR is needed to re-introduce the changes (the main branch and these changes are probably no longer in sync with one another, but I might be wrong).

@bukka mentioned this pull request Dec 27, 2020